Abstract

The impact of fragmentation by human activities on genetic diversity of forest 
trees is an important concern in forest conservation, especially in tropical for- 
ests. Dysoxylum malabaricum (white cedar) is an economically important tree 
species, endemic to the Western Ghats, India, one of the world's eight most 
important biodiversity hotspots. As D. malabaricum is under pressure from
disturbance, fragmentation, and overharvesting, conservation efforts are
required for this species. In this study, the range-wide genetic structure of twelve
D. malabaricum populations was evaluated to assess the impact of human activ- 
ities on genetic diversity and infer the species' evolutionary history, using both 
nuclear and chloroplast (cp) DNA simple sequence repeats (SSR). As genetic 
diversity and population structure did not differ among seedling, juvenile and 
adult age classes, reproductive success among the old-growth trees and
long-distance seed dispersal by hornbills are suggested to help maintain
genetic diversity. The fixation index (F_IS) was significantly correlated with
latitude, with a higher level of inbreeding in the northern populations, possibly
reflecting a more severe ecosystem disturbance in those populations. Both 
nuclear and cpSSRs revealed northern and southern genetic groups with some 
discordance of their distributions; however, they did not correlate with any of 
the two geographic gaps known as genetic barriers to animals. Approximate 
Bayesian computation-based inference from nuclear SSRs suggested that
population divergence occurred before the last glacial maximum. Finally, we
discuss the implications of these results, in particular the presence of a clear
pattern of historical genetic subdivision, for conservation policies.
Introduction

Forest fragmentation caused by changes in human land 
use is of primary concern for sustainability and conserva- 
tion biology in terrestrial ecosystems across the Earth 
(Aguilar et al. 2006) and, especially, in many tropical 
countries that have been experiencing rapid population 
growth over the last decades (Abdullah and Nakagoshi 
2007). Conservation of tropical trees is particularly 
important as they provide habitats and ecological niches 
for thousands of species (Hamilton 1999). As fragmentation
restricts pollen and seed dispersal, it modifies gene flow
and alters historical patterns of genetic subdivision 
(Murawski et al. 1994). Hence, anthropogenic landscape 
change and habitat fragmentation may threaten the 
genetic connectivity of many plant species and ultimately 
lead to their disappearance as isolated populations are at 
risk of losing genetic diversity that is critical to their 
long-term survival (Sork and Smouse 2006). Thus, the 
evaluation of the impact of fragmentation on genetic 
diversity of forest trees has been one of the main topics 
in forest conservation for many years (Murawski et al. 
1994; Hamilton 1999; Bacles et al. 2006; Jump and
Penuelas 2006). 

The Western Ghats region is a long mountainous 
massif (8-22°N, 73-77°E) that runs along the entire west 
coast of peninsular India (Fig. 1; Kodandapani et al.
2004). Together with Sri Lanka, the Western Ghats is one 
of the world's eight most important biodiversity hotspots 
based on exceptional endemism and conservation need 
(Myers et al. 2000). However, the Western Ghats faces 
severe threats from human disturbance due to deforesta- 
tion, development activities, conversion of forests to plan- 
tations, and habitat fragmentation (Raman 2006), while 
natural reserves in the area are limited in size and frag- 
mented by an intervening matrix of agricultural land and 
tree plantations (Kodandapani et al. 2004). Indeed, 
Menon and Bawa (1997) estimated that the natural vege- 
tation of the Western Ghats has declined by 40% during 
the period 1920-1990, resulting in a fourfold increase in 
the number of fragments and an 83% reduction in size of 
surviving forest patches (Raman and Mudappa 2003). In 
the southern part of Western Ghats, Jha et al. (2000) 
detected a loss of 25.6% in forest cover over the period 
1973-1995. One of the major causes of forest fragmenta- 
tion in the Western Ghats is the spread of plantations, 
particularly, tea, coffee, and Eucalyptus (Raman 2006). 
Although India's net forest cover has actually increased 
since the 1990s due to agro-forestry plantations, social
forestry, and mass afforestation (Ravikanth et al. 2000),
the health of the native forests has declined
through reduction of canopy cover and forest density.
Therefore, forest degradation and overexploitation of
individual species are still ongoing in a large
proportion of the existing forests (Rawat and Kishwan
2008), which justifies conservation efforts for tree species
of the Western Ghats. As genetic diversity of historical 
lineages cannot be recovered if it is lost (Moritz 2002), 
assessment of conservation genetics of species for which 
there is significant risk of diversity loss is essential to 
maintain their evolutionary integrity (e.g., Frankel 1974; 
Crandall et al. 2000; Moritz 2002). As the importance of 
biogeographical considerations in attempts to conserve 
populations of diverse organisms is also becoming 
increasingly clear (Whittaker et al. 2005), a wide-scale 
genetic survey of the target species' complete range and
the identification of the main historical lineages of the
species are a crucial first step.

Unfortunately, only a handful of studies have examined
genetic variation of plant species in the Western Ghats
(Deshpande et al. 2001; Nageswara Rao et al. 2001, 2007;
Padmini et al. 2001; Ravikanth et al. 2001, 2008; Bahulikar
et al. 2004; Ramesha et al. 2007; Ismail et al. 2012),
and none of them has examined the wide-scale genetic
structure and population demography of plant species. In
particular, one could expect to find distinct population
structure as the Western Ghats has two old (500 million
years) geographical gaps in the southern part, the Palghat
and Shencottah gaps (Robin et al. 2010). The former is a
30-40 km wide valley stretching from the west coast
inwards at 11°N and is the largest disruption in this
continuous mountain range, while the latter is narrower
(7.5 km at 9°N). Although the age and origin of the Palghat
gap are still controversial (D'Cruz et al. 2000), its
impact on population genetic structure has been shown
in three recent studies in elephants (Vidya et al. 2005),
montane birds (Robin et al. 2010), and frogs (Nair et al.
2012). Other than that, very little is known about the impact
of the topography of the Western Ghats on genetic structure,
especially in plants, and it is not known whether
these gaps could be considered as boundaries for
conservation units of plant species.

Figure 1. The location of the 12 Dysoxylum [...]

White cedar (Dysoxylum malabaricum Bedd. [Meliaceae])
is a large canopy tree found in evergreen and semi-evergreen
forests at 200-1200 m altitude in the Western
Ghats. It grows to a height of 30-40 m or more, and
a girth of 3-4 m (Shivanna et al. 2003). D. malabaricum is
primarily an outbreeding species, but rare cases of selfing 
are also observed (Ismail et al. 2012). It is an economi- 
cally important tree species endemic to the Western 
Ghats. Among many local tribes, D. malabaricum is a 
sacred tree used for medicinal, nutritional, commercial, 
and religious purposes (Kumar 2009). Its lustrous and 
sweet-scented wood is highly valued for various wood- 
works (Shivanna et al. 2003) and its fruits and wood are 
also harvested for use in traditional medicine. In fact, 
recent research has shown that compounds in D. mala- 
baricum may be effective against malaria mosquitoes 
(Nathan et al. 2006). In the Western Ghats, forests are a 
patchwork of state, community, and privately owned 
land. D. malabaricum is protected by law in state-owned 
forests and may not be harvested there. In private forests,
it may be harvested by locals after obtaining permission from the
Forest Department, which also involves a fee, and trees are
typically harvested after they attain a girth of 180 cm
(Menon and Balasubramanyan 2006). However, due to its
high economic value, the natural range of D. malabaricum 
has been heavily fragmented (Shivanna et al. 2003). Although 
D. malabaricum has not yet been assessed by the IUCN 
Red List, it has already been categorized as Endangered
(EN) under the Indian National Threat Assessment using 
the same criteria as the IUCN (Ravikumar and Ved 
2000). Regeneration is poor in this species, and a loss of 
juvenile individuals was prominent in a study conducted 
in Navangere, one of the northernmost populations 
(Shivanna et al. 2003; Y. Tsuda, pers. obs.). In line with 
this, another study in Coorg, in the central Western 
Ghats, detected almost no young adult individuals of 
D. malabaricum (S. A. Ismail, J. Ghazoul, G. Ravikanth,
R. Uma Shaanker, C. G. Kushalappa, C. J. Kettle,
submitted; Khan 2007). The absence of young individuals
could be due not only to the overharvest of wood and
fruit (Shivanna et al. 2003) but also to the combined
impact of recent change in land use and fragmentation, 
and its indirect effect on niche competition in the ecosys- 
tem. In particular, the lack of canopy trees in fragmented 
forests caused by human land use decreases D. malabari- 
cum seedling survival especially in the summer season. 
Moreover, predators of seeds of D. malabaricum have fre- 
quently been observed lately (G. Ravikanth, pers. obs.). It 
therefore seems that anthropogenic changes create a cas- 
cade of events at the local scale, which in turn could 
affect wide-scale genetic patterns. In light of this, it is 
important to evaluate genetic diversity at the species level 
and the historical connectivity among populations. 
Regarding the genetic connectivity and forest fragmenta- 
tion, D. malabaricum is insect pollinated, and seeds are 
dispersed mainly via the Malabar grey hornbill (Ocyceros
griseus) and occasionally by other large birds such as 
imperial pigeons and wood pigeons (Ganesh and Davidar 
2001; Shivanna et al. 2003). 

A study on fine-scale spatial genetic structure (SGS) and 
paternity analysis by Ismail et al. (2012) in the Coorg dis- 
trict revealed that the majority of pollination events 
occurred within sacred groves and did not go beyond 
290 m. Furthermore, the proportion of short-distance 
mating events (<100 m) was much larger in forest patches 
with low D. malabaricum density than in stands with 
larger population sizes. Although no difference in genetic 
diversity could be detected among life stages (adults, 
saplings, seedlings, and embryos), there was a significant 
increase in relatedness among juveniles within the shortest 
distance class (<100 m). This was interpreted as the first 
signs of increased inbreeding due to fragmentation. 
Although gene flow between neighboring forest patches 
did occur, and the maximum distance of pollination 
detected was 23.6 km, it is not known whether this is 
enough to maintain genetic connectivity, and population
genetic structure at a larger scale still needs to be analyzed. 

In the present study, we used 11 species-specific nuclear
SSR (nSSR; simple sequence repeat) markers (Hemmila 
et al. 2010) and one chloroplast SSR (cpSSR, Weising and 
Gardner 1999) to investigate the genetic diversity and 
structure of D. malabaricum across its distribution range 
and discuss implications for conservation efforts in the Western
Ghats. More specifically, we (1) compare population genetic
structure at biparentally inherited nSSRs (dispersed through
pollen and seeds) and a maternally inherited cpSSR (dispersed
only through seeds), (2) assess the impact of the Palghat
and Shencottah gaps on genetic structure, (3) infer
population demographic history using an approximate Bayesian
computation (ABC) approach, and finally, (4) based on
these results, offer some suggestions for conservation and
management strategies.

Materials and Methods 
Study site 

Leaf samples from 343 individuals were collected in July 
2010 from twelve populations located at 8.5°N-14.9°N in 
the Western Ghats (Fig. 1) and categorized according to 
the tree's height and diameter at breast height (dbh). The 
samples include 167 adults (dbh more than 10 cm), 72 
juveniles (height 1 m or above, but dbh <10 cm), and 
104 seedlings (height <1 m, i.e., 1 or 2 years old). Adults 
were sampled in all populations, whereas juveniles were 
sampled in five populations and seedlings in seven. In 
total, 8-51 individuals were sampled in each population 
(Table 1). Populations represented different degrees of 
disturbance, ranging from small and disturbed populations
to large and protected ones. Sampling was performed
in quadrats of 10 × 10 m. The number of
quadrats depended on the population size and was limited
to 20 in large populations (>100 individuals). Leaves
were collected from each individual and stored at -80°C
prior to DNA extraction.

DNA extraction and amplification 

DNA was extracted using a modified CTAB protocol 
(Doyle and Doyle 1987). As the quality of the extracted
DNA was not satisfactory, the whole genome of each
extracted DNA sample was amplified using the Illustra
GenomiPhi V2 DNA amplification kit (GE Healthcare
Limited, Buckinghamshire, U.K.). Eleven species-specific
nSSRs (locus Dysmal 1, 2, 3, 7, 9, 13, 14, 17, 18, 22, and 
26; Hemmila et al. 2010) and one cpSSR (locus
ccmp7; Weising and Gardner 1999) were examined in this
study. Initially, five "universal" cpSSR loci (ccmp 2, 4, 5,
7, and 10; Weising and Gardner 1999) were screened, but
only one amplified in D. malabaricum. The primer pairs
of the selected loci were mixed into four multiplex sets
and amplified using the Type-it Microsatellite PCR kit
(Qiagen, Venlo, Netherlands) in 6.0 µL mixtures containing
1.2 µL of genomic DNA (1-10 ng), 3.0 µL of Multiplex
PCR master mix buffer, 1.2 µL of H2O, and 0.6 µL of
primer mix (with the concentration of each primer pair
adjusted to 1-2 µmol/L). Samples were amplified in a
DNA thermal cycler (Takara Bio Inc., Shiga, Japan) using
the following program: activation of the hot-start DNA
polymerase and denaturation at 95°C for 15 min; 32 cycles of
95°C for 30 sec, 57°C for 30 sec, and 72°C for 30 sec;
and a final 30 min extension step at 72°C. PCR products
were loaded on a MegaBACE 1000 (GE Healthcare Life
Science) and genotyped using the FRAGMENT PROFILER
v. 1.2 software (Amersham Biosciences, Buckinghamshire,
U.K.). In cases where amplification and/or
genotyping failed, the procedure was tried one more time
to avoid missing data. Three genotype data sets were
created: one for all individuals, one for adults only, and
one for seedlings and juveniles only. These three data sets
were analyzed separately in order to allow comparisons 
between age classes. For this purpose, the data set 
containing adults was reduced to retain only the popula- 
tions where juvenile and seedling data was also available. 



Table 1. Location, sample size, and values of genetic diversity parameters: gene diversity (h), allelic richness (A[n]), and fixation index (F_IS) for
nuclear SSRs and cpSSR of 12 populations of Dysoxylum malabaricum.

                                                               Age class                            Nuclear SSRs           cpSSR
Population                  Disturbance¹  Latitude  Longitude  Adults  Juveniles  Seedlings  Total  h      A[14]  F_IS     h      A[8]
                                          (°N)      (°E)
1. Yakambi                  C             14.85     75.11      10      0          16         26     0.639  4.118  0.085    0.803  4.643
2. Jadegadde                C             14.80     74.74      6       0          16         22     0.621  4.339  0.205*   0.909  5.799
3. Hittalahalli             C             14.79     74.80      16      16         16         48     0.683  4.809  0.061    0.840  5.006
4. Navangere                B             14.56     74.94      19      11         8          38     0.595  4.388  0.176*   0.755  4.526
5. Tavanandi                A             14.46     74.80      19      14         17         50     0.594  4.260  0.073    0.855  5.167
6. Sarekoppa                C             14.44     74.51      10      16         16         42     0.579  4.088  0.140*   0.757  4.059
7. Agumbe                   A             13.80     75.00      12      0          0          12     0.623  4.299  0.165    0.788  4.224
8. Coorg                    B             12.40     75.90      15      15         15         45     0.706  5.253  0.089*   0.809  4.625
9. Peria                    C             11.85     75.80      8       0          0          8      0.733  5.378  0.131    0.429  2.000
10. Sholayar                A             10.30     76.71      19      0          0          19     0.572  4.309  -0.042   0.696  3.242
11. Periyar Tiger Reserve   A             9.53      77.20      17      0          0          17     0.535  3.761  0.087    0.728  3.668
12. Arippa                  A             8.50      76.98      16      0          0          16     0.636  4.642  0.033    0.642  2.996
Total                                                          167     72         104        343
Mean                                                                                                0.626  4.470  0.074    0.751  4.163

¹ Disturbance level is categorized as A, low disturbance and >50 individuals; B, medium disturbance; and C, high disturbance and low population size.
* F_IS significantly greater than zero (P < 0.05; see Results).

Data analysis

Nuclear SSR 

Summary statistics 

Null alleles arising from mutations in primer binding 
regions, or failure of amplification of longer fragments, 
may influence microsatellite genotyping. F-statistics may
therefore be positively biased by false homozygotes. To
check for the presence of null alleles, data were analyzed
with FreeNA (Chapuis and Estoup 2007). FreeNA
estimates unbiased F_ST in microsatellite data sets containing
null alleles using the ENA (excluding null alleles) method,
which detects unexpected homozygosity patterns. The
genetic diversity parameters within each population were
evaluated by determining the gene diversity (h; Nei 1987),
allelic richness based on seven diploid individuals (A[14];
El Mousadik and Petit 1996), and the fixation index (F_IS)
using the computer program FSTAT 2.9.3 (hereafter,
FSTAT; Goudet 1995, 2001). The significance of the
deviation of F_IS values from 0 was estimated for each locus
and across the loci for each population on the basis of
1000 randomizations using FSTAT. Genotypic disequilibrium
was tested for all locus pairs in each population by
randomization. The resulting P-values (α = 0.05) were
adjusted applying a sequential Bonferroni correction (Rice
1989). To evaluate whether the populations examined 
here had experienced recent bottlenecks, we employed the 
BOTTLENECK 1.2.02 software (hereafter, BOTTLENECK 
analysis; Piry et al. 1999; Cornuet and Luikart 1996) 
under both the infinite allele mutation model (IAM) and 
the two-phase model (TPM; 30% of multistep mutation 
and 70% single-step mutation) assumptions. 
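
As a rough illustration of the within-population statistics named above, the following sketch computes Nei's unbiased gene diversity and a simple fixation index, F_IS = 1 - H_O/H_E, for a single locus. It is not the FSTAT implementation (which relies on Weir and Cockerham's estimators); the function names and the toy genotypes are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter


def gene_diversity(genotypes):
    """Nei's unbiased gene diversity (expected heterozygosity) for one locus.

    genotypes: list of (allele1, allele2) tuples, one per diploid individual.
    """
    alleles = [a for g in genotypes for a in g]
    n = len(alleles)  # number of gene copies (2 x number of individuals)
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def f_is(genotypes):
    """Simple fixation index: positive values indicate a heterozygote deficit."""
    return 1.0 - observed_heterozygosity(genotypes) / gene_diversity(genotypes)


# Hypothetical genotypes (allele sizes in bp) for one locus in one population.
toy_locus = [(150, 150), (150, 152), (152, 152),
             (150, 150), (152, 154), (154, 154)]

print(f"h = {gene_diversity(toy_locus):.3f}, F_IS = {f_is(toy_locus):.3f}")
```

In this toy sample only two of six individuals are heterozygous, so the index is clearly positive, which is the pattern expected under inbreeding or in the presence of null alleles.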

Population structure 

The degree of genetic differentiation among populations 
was evaluated by calculating the overall fixation index 
(F_ST; Weir and Cockerham 1984) and its confidence
intervals (95 and 99%), determined on the basis of 1000
bootstrap replicates, using FSTAT. Pairwise F_ST values
were also calculated, and the significance of the pairwise
population differentiation was tested by randomizing
multilocus genotypes between pairs of populations using
FSTAT. We also calculated the standardized value of
G_ST, known as G'_ST (Hedrick 2005), which ranges from 0
to 1. Isolation by distance (IBD; Wright 1943) was evaluated
with GenAlEx 6.4 (Peakall and Smouse 2006) using
Rousset's (1997) method, which tests for a statistical
association between pairwise population differentiation
(F_ST/(1 − F_ST)) and the natural logarithm of the direct minimum
geographic distance among populations. The genetic
relationships among populations were evaluated by generating
a neighbor-joining (NJ) tree based on the D_A genetic
distances (Nei et al. 1983), using the Populations 1.2.30 BETA
software (Langella 2007). The statistical confidence in the
topology of the tree was evaluated by 1000 bootstraps
derived using the same software. The NJ tree was
reconstructed on a topographic map using the Mapmaker and
GenGIS2 software (Parks et al. 2009).
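
The isolation-by-distance test described above can be sketched as a Mantel-type permutation test between linearized differentiation, F_ST/(1 − F_ST), and the natural logarithm of geographic distance. The sketch below is an assumption-laden illustration, not the GenAlEx procedure: the coordinates are taken from Table 1, but the pairwise F_ST values are hypothetical, and the helper names (`haversine_km`, `pearson`, `mantel`) are invented for this example.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(gen, geo, n_perm=999, seed=1):
    """One-sided Mantel test: permute population labels of the genetic matrix."""
    n = len(gen)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo_vec = [geo[i][j] for i, j in pairs]
    r_obs = pearson([gen[i][j] for i, j in pairs], geo_vec)
    rng = random.Random(seed)
    idx = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(idx)
        r_perm = pearson([gen[idx[i]][idx[j]] for i, j in pairs], geo_vec)
        if r_perm >= r_obs:
            hits += 1
    return r_obs, hits / (n_perm + 1)


# Coordinates (°N, °E) of populations 1, 5, 7, 8, and 12 from Table 1.
coords = [(14.85, 75.11), (14.46, 74.80), (13.80, 75.00), (12.40, 75.90), (8.50, 76.98)]
# HYPOTHETICAL pairwise F_ST values, for illustration only (not from the study).
fst = [[0.00, 0.03, 0.05, 0.08, 0.15],
       [0.03, 0.00, 0.04, 0.07, 0.14],
       [0.05, 0.04, 0.00, 0.06, 0.12],
       [0.08, 0.07, 0.06, 0.00, 0.10],
       [0.15, 0.14, 0.12, 0.10, 0.00]]

n = len(coords)
gen = [[fst[i][j] / (1 - fst[i][j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
geo = [[math.log(haversine_km(*coords[i], *coords[j])) if i != j else 0.0
        for j in range(n)] for i in range(n)]

r_obs, p_value = mantel(gen, geo)
print(f"Mantel r = {r_obs:.3f}, one-sided P = {p_value:.3f}")
```

With only five populations there are just 120 distinct label permutations, so the P-value is coarse; the full data set of twelve populations would not have this limitation.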

For inferences on population structure, the software 
STRUCTURE was used (Pritchard et al. 2000; Hubisz 
et al. 2009). It performs Bayesian assignment of individu- 
als to a given number of genetic clusters (K), assuming 
that each cluster is in Hardy-Weinberg and linkage equi- 
libria. Here, K = 1 through K = 15 were investigated 
under the correlated allele frequencies model by running 
100,000 iterations of each K, with a burn-in length of 
100,000 iterations, and averaging the results over 20 runs. 
Data on sampling location was used in the LOCPRIOR 
function, which can further assist the clustering. To help 
determine the optimal K, ΔK was calculated as described
by Evanno et al. (2005). The distributions of the probability
of the data (LnP(D)) and the ΔK values were visualized
in the STRUCTURE HARVESTER software (Earl and 
vonHoldt 2012). Bar charts for the proportions of the 
membership coefficient of each individual in STRUC- 
TURE analysis over 20 runs for each K were summarized 
using CLUMPP (Jakobsson and Rosenberg 2007) and 
visualized in DISTRUCT (Rosenberg 2004). 
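
The ΔK statistic mentioned above can be sketched in a few lines. The version below assumes the commonly used form in which the absolute second-order difference of the mean LnP(D) across successive K values is divided by the standard deviation of LnP(D) among replicate runs at that K. The input values are hypothetical (three runs per K rather than the 20 used in the study), and `delta_k` is an illustrative name, not a function of STRUCTURE HARVESTER.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Evanno-style delta K from replicate LnP(D) values.

    lnp_by_k: dict mapping K -> list of LnP(D) values from replicate runs.
    Returns a dict K -> delta K for every K that has both neighbours
    (the smallest and largest K cannot be evaluated).
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# HYPOTHETICAL LnP(D) values, three replicate runs per K.
toy_runs = {
    1: [-5000, -5002, -4998],
    2: [-4700, -4703, -4697],
    3: [-4500, -4502, -4498],
    4: [-4480, -4490, -4470],
    5: [-4470, -4500, -4440],
}

dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(dk, "-> best K:", best_k)
```

The toy values are constructed so that the likelihood keeps rising slowly after K = 3 while its variance among runs grows, which is the situation in which ΔK and the raw LnP(D) curve can point to different values of K.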

Chloroplast SSR 

Summary statistics 

The haplotype frequencies over populations were visual- 
ized on a map using the "Pies on map" function in 
Genetic Studio (Dyer 2009). The gene diversity (h) and
haplotype richness based on eight haploid individuals (H[8];
El Mousadik and Petit 1996) were calculated using the
CONTRIB software (developed by RJ Petit;
http://www.pierroton.inra.fr/genetics/labo/Software/Contrib/).

Population structure 

The population differentiation measurement, R_ST (Slatkin
1995; Pons and Petit 1996), which takes into account the
genetic distance (i.e., number of repeat differences)
between ordered haplotypes, was calculated. To test
whether R_ST values were significantly higher than the
values of unordered haplotype-based G_ST, 1000 permutations
were evaluated in the software Permut & cpSSR
(developed by RJ Petit). Pairwise R_ST values were calculated in
Arlequin 3.1 and IBD was evaluated with the two matrices
of R_ST/(1 − R_ST) and the natural logarithm of
geographical distance in GenAlEx. The standardized values of
G'_ST were also calculated. An NJ tree based on the (δμ)²
genetic distances (Goldstein et al. 1995) was constructed
using the software Populations (Langella 2007) and
modified in the GenGIS2 software, as done on nSSR data.
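
For a single microsatellite locus, the (δμ)² distance named above is simply the squared difference between the mean allele sizes (in repeat units) of two populations. The sketch below illustrates that calculation under this single-locus assumption; the allele sizes are hypothetical and `delta_mu_squared` is an illustrative name, not a function of the Populations software.

```python
def delta_mu_squared(pop_a, pop_b):
    """(delta mu)^2 distance for one locus.

    pop_a, pop_b: lists of allele sizes expressed in repeat units,
    one entry per sampled haplotype.
    """
    mean_a = sum(pop_a) / len(pop_a)
    mean_b = sum(pop_b) / len(pop_b)
    return (mean_a - mean_b) ** 2


# HYPOTHETICAL repeat counts at one cpSSR locus in two populations.
north = [10, 10, 11, 12]
south = [14, 15, 15, 16]

print(delta_mu_squared(north, south))
```

Because the distance depends on allele size and not only on allele identity, it is sensitive to the kind of ordered north-to-south shift in allele length that a stepwise mutation process can produce.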

Nuclear and chloroplast SSR 

Comparison of individual-based SGS between nSSRs 
and cpSSR 

As IBD was significant for both genomes (Fig. S1),
individual-based SGS was examined. To compare the SGS in
the nuclear and cp genomes in detail, a spatial autocorre- 
lation analysis was performed separately for the nSSR and 
cpSSR data sets. For nSSR data, multilocus genotypic dis- 
tances between individuals were calculated according to 
Peakall et al. (1995) and then spatial autocorrelation coef- 
ficients, r (Smouse and Peakall 1999), were calculated for 
each distance class of 100 km using GenAlEx. Similarly, 
squared values of repeat number among haplotypes were 
considered as genetic distance and the spatial autocorrela- 
tion was analyzed for cpSSR data. The upper and lower 
95% confidence intervals around r were determined with 
999 bootstraps, and the statistical significance of the
autocorrelation was tested with 999 permutations. Furthermore,
the heterogeneity in SGS among genomes was
tested with the single-distance class (t²) and multidistance
class (ω) criteria by the method of Smouse et al. (2008)
implemented in GenAlEx.

Demographic history 

Recently, ABC has emerged as a powerful and flexible 
approach to estimate demographic and historical parame- 
ters and to quantitatively compare alternative scenarios 
(Bertorelle et al. 2010). The software DIYABC v1.0.4.39
(Cornuet et al. 2008) was used to infer past demography.
Four populations were defined based on the results from
the STRUCTURE analysis (Fig. 5) and the NJ tree (Fig.
S2): PopA (populations 1-3), PopB (populations 4-6),
PopC (population 7, Agumbe), and PopD (populations 8-12).
We examined three simple demographic scenarios
(Fig. 2). In scenario 1, PopC (Agumbe population), which
appeared to be admixed in the STRUCTURE analysis, is
assumed to have originated from the admixture of PopA
and PopB at time t1. The rate of admixture from PopA
to PopC was set as "ra" and the one from PopB to PopC
was set as "1 - ra". At time t1, PopB merged with PopA
and the southern PopD merged with PopA and PopB at
t3. In scenario 2, PopB and PopC were merged with PopA
at time t1, and PopD was merged with PopA at t3. Finally,
scenario 3 corresponds to a simple population split scenario
where PopA, PopB, PopC, and PopD all merged
simultaneously at time t1. A population size change of
the ancestral population (Pop1 and Pop2) was assumed
at t3 in each scenario. Importantly, the models assume
that there is no migration among populations under any
of the three scenarios.

As DIYABC requires that one population traces back as 
the ancestral population, PopA was chosen in all scenarios, 
although the level of genetic diversity at SSR loci was 
similar in all populations. This arbitrary choice had
no impact on the results, as additional analyses using differ- 
ent ancestral populations produced consistent results. The 
change in number of repeats followed a generalized step- 
wise mutation model (GSM; Estoup et al. 2002) and single 
nucleotide indels (SNI) were also allowed. The mutation 
rate of the former was assumed to be higher than the muta- 
tion rate of the latter. The default values of the priors were 
used for all parameters (Table S1). The mean values of the
expected heterozygosity (H_E) and the number of alleles (A)
were used as summary statistics for each of the three
populations, and for population pairs, F_ST was also used. One
million simulations were performed for each scenario, and
the most likely scenario was evaluated by comparing
posterior probabilities with the logistic regression method. The
goodness-of-fit of the three scenarios was also assessed by
a principal component analysis (PCA) using the option
"model checking" in DIYABC.

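The logic of comparing scenarios by ABC can be illustrated with a toy rejection sampler. This is a deliberately simplified stand-in, not the DIYABC workflow: it uses the direct approach (the share of each scenario among the simulations closest to the observed summary statistic) instead of logistic regression, and the simulator, priors, and observed value are all hypothetical.

```python
import random


def simulate_summary(scenario, rng):
    """HYPOTHETICAL stand-in for a coalescent simulator.

    Draws a parameter from a scenario-specific prior and returns one noisy
    summary statistic (think of it as a simulated pairwise F_ST).
    """
    if scenario == 1:
        theta = rng.uniform(0.0, 0.2)
    else:
        theta = rng.uniform(0.1, 0.5)
    return theta + rng.gauss(0.0, 0.02)


def abc_model_choice(observed, scenarios, n_sims=20000, keep=500, seed=42):
    """Approximate posterior probability of each scenario by rejection.

    Simulates n_sims data sets per scenario, keeps the `keep` simulations whose
    summary statistic is closest to the observed one, and returns the fraction
    of retained simulations contributed by each scenario.
    """
    rng = random.Random(seed)
    sims = []
    for scenario in scenarios:
        for _ in range(n_sims):
            distance = abs(simulate_summary(scenario, rng) - observed)
            sims.append((distance, scenario))
    sims.sort(key=lambda pair: pair[0])
    closest = sims[:keep]
    return {s: sum(1 for _, sc in closest if sc == s) / keep for s in scenarios}


posterior = abc_model_choice(observed=0.33, scenarios=[1, 2])
print(posterior)
```

In a real analysis several summary statistics would be combined into a multivariate distance, and the regression step would correct for the imperfect match between retained simulations and the observed data.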
Results 
Nuclear SSRs 

Genotyping of nSSRs was successful in 86% of the cases 
and only three individuals did not amplify at any locus. 
Missing data were mainly confined to the two seedling
groups of Navangere and Hittalahalli, where the success
rate was only 14% and 19%, respectively, probably
because of difficulties in keeping the raw materials in good
condition in the field and in extracting purified DNA from the
seedling samples. As there was no significant difference
between the three age classes in our initial analysis,
we pooled the different age classes in each location, lead- 
ing to a dataset with twelve populations. 

Summary statistics 

As only a small number of null alleles were detected and
the F_ST values corrected for null alleles were almost the
same as the uncorrected ones at all loci (Table S2), we 
used the original genotype data. Gene diversity and allelic 
richness were roughly similar between populations and 
were not significantly correlated with latitude (Table 1 
and Fig. 3). The fixation index (F_IS) did not deviate
significantly from zero at any locus in any population.
However, significant positive values (P < 0.05)
were detected in the Jadegadde (2), Navangere (4), Sarekoppa
(6), and Coorg (8) populations when F_IS was calculated
across loci. F_IS was higher in the northern populations,
and the correlation with latitude was significant
(P < 0.05). Genotypic disequilibrium was significant at only
one of the 55 locus pairs (P < 0.05). In the BOTTLENECK
analysis, a significant H_E excess (P < 0.05) was
detected in the northernmost population, Yakambi (1),
under the IAM, while significant H_E deficits
were detected in the Navangere (4) and Sholayar (10) populations
under the TPM (P < 0.05).
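
The latitudinal trend in F_IS can be examined directly from the values in Table 1. The sketch below computes a Pearson correlation coefficient between latitude and the per-population F_IS; the choice of Pearson's r is an assumption, as the text does not state which correlation test was used, and the sketch does not attempt to reproduce the reported significance test.

```python
import math

# Latitude (°N) and nuclear F_IS for populations 1-12, as listed in Table 1.
latitude = [14.85, 14.80, 14.79, 14.56, 14.46, 14.44,
            13.80, 12.40, 11.85, 10.30, 9.53, 8.50]
fis_values = [0.085, 0.205, 0.061, 0.176, 0.073, 0.140,
              0.165, 0.089, 0.131, -0.042, 0.087, 0.033]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(latitude, fis_values)
print(f"r(latitude, F_IS) = {r:.2f}")
```

A positive coefficient corresponds to higher F_IS at higher latitudes, that is, in the northern populations.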

Population structure 

The population differentiation indices, F_ST and G'_ST, were
0.09 and 0.33, respectively. Although IBD was clear
(Fig. S1), the variance in F_ST between closely located
populations remains high, suggesting that there are
several combinations of neighboring populations with
strong genetic differentiation. In the STRUCTURE analysis,
the probability of the data (LnP(D)) increased
progressively up to K = 8 where it started to plateau
(Fig. 4A). Indeed, several single populations (e.g., popula- 
tions 3, 8, 10, 11, and 12) were assigned to specific clus- 
ters for higher values of K (Fig. 5). However, the 
clustering pattern for values of K > 4 showed complicated 
multimodality, that is, the assignment of individuals to 
clusters is inconsistent between runs, which indicates that 
such models are difficult to fit to the data. On the other 
hand, ΔK indicated that the optimal value of K was
three (Fig. 4B). We therefore restricted further analyses to 
K = 3 (Fig. 5). For K = 3, the five southernmost popula- 
tions, distributed over more than half of the species dis- 
tribution range, form a separate cluster. The 
northernmost populations divide into two clusters. The 
overall pattern corresponds well with a latitudinal gradi- 
ent of populations. The Agumbe population was not fully 
assigned to any of the three clusters; rather, it appears to 
be admixed (Fig. 5). Although several nodes were poorly 
supported by bootstraps, the result of the NJ tree showed 
a similar pattern as the STRUCTURE analysis (Fig. S2). 
Contrary to our expectations, the two geographic gaps 
did not have a clear effect on population genetic struc- 
ture. Instead, a genetic gap was detected north of the two 
geographic gaps. 

Chloroplast SSR 

Summary statistics 

A total of twenty-four alleles were detected in ccmp7 
(Fig. 1 and Table S3). All individuals were successfully 
amplified at this locus. The allelic richness showed a sig- 
nificant correlation with latitude, with highest values in 
the north (Fig. 3 and Table 1). The same trend was 
observed for gene diversity, but the cline was not signifi- 
cant. The distribution of haplotypes was well ordered 
with allele length increasing from north to south. 

Population structure 

The haplotype distribution pattern was mirrored in the
NJ tree, which showed a northern and a southern group (Fig.
S2). Significant IBD was found (Fig. S1), and alleles
seemed ordered in one northern and one southern lineage
(Fig. 1). Accordingly, R_ST was significantly higher than
G_ST (0.689 and 0.203, respectively, P < 0.01), indicating
the presence of a phylogeographic structure. The level of
population differentiation obtained with G'_ST was 0.87,
which is higher than the value detected with nSSRs
(G'_ST = 0.33). Overall, the genetic structure in the cpSSR
was clearly different from the one detected in nSSRs, but
was in line with the nSSRs in that there was no
relationship with the two geographical gaps (Figs. 1 and S2).

Figure 3. Relationship between genetic diversity parameters (allelic richness, gene diversity, and fixation index) and latitude and longitude for 12
Dysoxylum malabaricum populations in nuclear and chloroplast SSRs.

Nuclear and chloroplast SSRs 

Comparison of individual-based SGS between 
nSSRs and cpSSR 

The SGS pattern differed between the genomes: the spatial 
autocorrelation coefficient, r, was significantly positive 
from the first to the third distance classes in the cpSSR, 
whereas it was significant only in the first distance class 
for nSSRs. Significantly negative values were reached in the 
sixth and seventh distance classes for the cpSSR. The 
heterogeneity tests showed that the entire SGS was different 
between the two genomes (ω = 42.120; P = 0.001, Fig. 6). 
This indicates that the nuclear genome is more locally 
structured than the cp one, as spatial autocorrelation was 
found over larger distances in the cp marker. 

Demographic history 

Scenario 3, which corresponds to a simple simultaneous 
split of the four populations, had by far the highest 
posterior probability (0.6146, 95% CI = 0.5596-0.6697) 
(Table 2). The median values of the effective population 
size were 9230, 6220, 5610, 9040, 6770, and 4810 for N A 
(PopA), N B (PopB), N C (PopC), N D (PopD), N 1 (Pop1), 
and the ancestral N 2 (Pop2), respectively (Table S4). The 
median values of the divergence time, t2, and the time of 
population size change, t3, were 1760 and 4340 generations 
ago, respectively. If we assume a generation time of 
25 years (Ravikanth et al. 2008, G. Ravikanth, pers. obs.), 
the divergence time of the four populations would be 
44,000 years ago and the time of the ancestral population 
size change 108,500 years ago. However, the posterior 
distribution pattern suggested that t3 is poorly estimated 
(Fig. S3). The median values of the mutation rates of SSR 
and SNI at the examined loci were estimated at 5.09 × 10⁻⁴ 
and 2.38 × 10⁻⁵, respectively. Observed values of the 
expected H E and the number of alleles (A) in each 
population, and of H E , A and F ST for all possible 
combinations of population pairs, did not differ 
significantly from simulated values based on parameter 
values drawn from the posterior distributions for 
scenario 3. In the PCA, the observed data (large yellow dot) 
lie among the values obtained from the posterior 
distribution (large blue dots), whereas the small dots 
correspond to the prior distribution. This indicates a 
good fit of the posterior distribution based on scenario 3 
to the data (Fig. S4). 

Figure 5. The proportion of the membership 
coefficient for each individual in 12 Dysoxylum 
malabaricum populations for the inferred 
clusters in K = 2-8 by the STRUCTURE analysis. 
The definitions of the four populations used in 
the DIYABC (PopA-D) are also shown. 

Table 2. Posterior probability of each scenario and its 95% confi- 
dence interval based on the logistic estimate by DIYABC. 

Figure 6. Spatial genetic structure in nuclear and chloroplast SSRs 
evaluated by the autocorrelation coefficient, r (Smouse and Peakall 
1999). The asterisk marks the distance classes showing a significant 
coefficient. 

Discussion 

Genetic diversity of D. malabaricum 

In the present study we investigated genetic diversity and 
population structure across the range of D. malabaricum 
in the Western Ghats using both nuclear and cpSSRs. 
Firstly, as regeneration was poor, potentially because of 
overharvesting or newly generated niche competition 
induced by human land use, we aimed to compare 
genetic diversity among age classes, expecting lower 
genetic diversity in younger individuals. However, no dif- 
ference was found between them. This pattern was also 
detected in a fine-scale genetic study of D. malabaricum 
in the southern part of the Western Ghats (Ismail et al. 
2012). Indeed, as trees are typically harvested after about 
80 years, old adult trees are still relatively frequent (Y. 
Tsuda and G. Ravikanth, field obs.). Thus, these older 
trees would still contribute to the reproductive success 
and help maintain the genetic diversity in younger 
cohorts. In addition, as clusters of populations were 
detected by the STRUCTURE analysis, gene flow between 
populations within clusters may be sufficient to contribute 
to the maintenance of genetic diversity in individual 
populations. Although gene diversity and allelic richness 
did not show geographic patterns in nSSRs, the F IS values 
were significantly higher in the northern populations than 
in the southern populations, suggesting that inbreeding is 
higher in the north. This pattern supports the 
classification of populations according to their perceived 
level of disturbance, as populations classified as highly 
disturbed (category C) are concentrated in the north, 
whereas most of the least disturbed populations 
(category A) are found in the south 
(Table 1). Even in the southern part of the Western Ghats 
(around the Coorg population, which showed significant 
deviation of F IS from 0 in this study), a recent study on 
parentage and kinship analysis of D. malabaricum in a 
fragmented agro-forestry landscape (Ismail et al. 2012) 
found that pollen dispersal between sacred groves 
prevented the build-up of inbreeding at different life 
stages (adults, saplings, seedlings, and embryos). Data 
from both southern and northern populations hence suggest 
that the higher values of F IS in the northern populations 
detected in this study could indeed be due to inbreeding 
induced by small population sizes and relative isolation at 
the margin of the species range, together with serious 
disturbance. This interpretation is further supported by 
the detection of a bottleneck in the northernmost 
population, Yakambi (1), and by the observation of a 
smaller number of adult trees in this population (<20 adult 
trees) than in other northern marginal populations (e.g., 
Hittalahalli, Navangere, and Sarekoppa; >60 individuals; 
Y. Tsuda and G. Ravikanth, field obs.). 
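
The fixation index discussed above measures the deficit of 
heterozygotes relative to random-mating expectations, 
F IS = 1 − H O/H E. The Python sketch below shows this calculation 
for a single locus in a single population, assuming Nei's unbiased 
estimator of expected heterozygosity. The estimator and software 
actually used in this study are not stated in this section, and the 
function name and genotype data are hypothetical. 

```python
from collections import Counter


def fixation_index(genotypes):
    """F_IS = 1 - H_O / H_E for one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid tree.
    H_E uses the small-sample correction n / (n - 1), where n is the
    number of gene copies (an assumption, see the text above).
    """
    n_ind = len(genotypes)
    h_obs = sum(a != b for a, b in genotypes) / n_ind
    n_genes = 2 * n_ind
    counts = Counter(allele for pair in genotypes for allele in pair)
    sum_p2 = sum((c / n_genes) ** 2 for c in counts.values())
    h_exp = n_genes / (n_genes - 1) * (1.0 - sum_p2)
    return 1.0 - h_obs / h_exp


# Hypothetical genotypes (allele sizes in bp), ten trees per population.
# Both populations have equal allele frequencies but different
# numbers of heterozygotes.
outbred = [(150, 150)] * 2 + [(150, 152)] * 6 + [(152, 152)] * 2
inbred = [(150, 150)] * 4 + [(150, 152)] * 2 + [(152, 152)] * 4

fis_outbred = fixation_index(outbred)
fis_inbred = fixation_index(inbred)
print(f"F_IS outbred = {fis_outbred:.2f}, inbred = {fis_inbred:.2f}")
```

In this toy example the population with fewer heterozygotes has the 
higher F IS, which is the pattern interpreted as inbreeding in the 
northern populations. Multilocus estimates and their significance 
tests require additional steps not shown here. 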

Population structure and demographic 
history 

Three and two genetic clusters were observed with nSSRs 
and cpSSR, respectively, but notably, the Palghat and 
Shencottah gaps did not appear to be reproductive barriers 
for either genome. The distributions of the clusters were 
not congruent between nuclear and cp markers; one 
explanation could be that population demographic history 
differs between the genomes, reflecting differences in 
inheritance and dispersal modes. The cpSSR revealed a 
clear geographical gradient in the distribution of ordered 
haplotypes, which is likely informative although only one 
locus was examined. 

Genetic clusters in the nuclear genome appear well 
defined over all populations except Agumbe (7), in which 
an admixture-like pattern was detected by the STRUCTURE 
analysis. However, the ABC analysis suggested that the most 
likely scenario was a simple population split into four 
population groups. The divergence time of the four groups 
was estimated at 1760 generations ago, or 44,000 years ago 
if we assume a generation time of 25 years for 
D. malabaricum. Estimating generation time remains 
difficult in species with long reproductive spans, such as 
forest trees (Petit and Hampe 2006). While D. malabaricum 
bears its first fruits at 12-15 years of age, it only 
reaches the canopy after around 25 years, at which stage 
the trees acquire their full reproductive potential. If we 
assume that the generation time is around 25 years, then 
the divergence time predates the last glacial maximum 
(LGM, 26,500-19,000 years before present) (Clark et al. 
2009). During the LGM the climate of the Western Ghats 
shifted toward colder and drier conditions than today, 
severely affecting the distribution of rainforests 
(Farooqui et al. 2010). In India, rainforests spread from 
isolated pockets only 4000-7000 years ago (Farooqui et al. 
2010). Thus, these facts together with the results of the 
ABC analysis suggest that the divergence of D. malabaricum 
occurred before the LGM, and that refugia were formed in 
several places. Our results from the STRUCTURE and ABC 
analyses support a model where modern populations of 
D. malabaricum were founded from one refugium in the south 
and two or three refugia in the north. Indeed, one such 
rainforest refugium has been found in the southern Western 
Ghats in a palynological study (Farooqui et al. 2010). 
Also, although D. malabaricum has a clinal distribution, 
important environmental variables, such as mean annual 
rainfall and length of dry season, remain largely constant 
throughout the range (Prasad et al. 2009). In conclusion, 
the overall population genetic structure was likely formed 
by historical gene flow and past climatic events rather 
than by recent human activities or adaptation to local 
climate. 
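
The conversion from generations to calendar years, and its 
dependence on the assumed generation time, can be checked with the 
short Python sketch below. The estimates in generations (1760 and 
4340), the 25-year generation time and the LGM interval are taken 
from the text; the function names are hypothetical, and the 12- and 
15-year values are the reported ages at first fruiting, used here 
only as illustrative lower bounds and not as generation times 
proposed in this study. 

```python
# Last glacial maximum interval, years before present (Clark et al. 2009).
LGM_START_YR = 26_500
LGM_END_YR = 19_000

# Median ABC estimates reported in the text, in generations.
T2_DIVERGENCE_GEN = 1760
T3_SIZE_CHANGE_GEN = 4340


def generations_to_years(generations, generation_time_yr):
    """Convert a time in generations to years."""
    return generations * generation_time_yr


def relative_to_lgm(age_yr):
    """Classify an age (years before present) relative to the LGM."""
    if age_yr > LGM_START_YR:
        return "before LGM"
    if age_yr >= LGM_END_YR:
        return "during LGM"
    return "after LGM"


# 25 years is the working assumption of the study; 12 and 15 years are
# illustrative lower bounds only (age at first fruiting).
for generation_time in (12, 15, 25):
    age = generations_to_years(T2_DIVERGENCE_GEN, generation_time)
    print(generation_time, age, relative_to_lgm(age))
```

Under the 25-year assumption the divergence time is 44,000 years, 
well before the LGM, whereas generation times close to the age at 
first fruiting would place it within the LGM interval. The 
conclusion of pre-LGM divergence therefore rests on the generation 
time assumption as well as on the ABC estimate itself. 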

The finding that an instant split into four groups fit 
the data better than a model with three groups, where 
two of them merged to found a fourth, admixed population, 
is in itself interesting. This suggests that the Agumbe 
population is a unit of its own, rather than an admixed 
population. This interpretation is supported by a recent 
study of freshwater fish in which ancestral polymorphism 
was explicitly taken into account (Sousa et al. 2012). In 
that study, ABC analysis was performed on simulated and 
empirical data, and it was found that the genetic structure 
was better explained by a population split model without 
admixture (a similar model to scenario 3 in the present 
study) than by a model with admixture, even when STRUCTURE 
analysis showed an admixture-like pattern. In fact, one of 
the issues with inferences of population demography is that 
ancestral shared polymorphisms are often difficult to 
separate from admixture or gene flow (Sousa et al. 2012), 
and such polymorphisms are extremely common in trees (e.g., 
Chen et al. 2010). Thus, the 
ABC approach in the present study provides demographic 
information for the conservation of relevant units in the 
target species. However, there are some caveats. As 
pointed out earlier, the ABC analysis implemented in 
DIYABC assumes no gene flow among populations. The 
rather low level of population differentiation (overall 
G' ST = 0.33) suggests that there may be some gene flow, 
although the clear STRUCTURE results suggest that 
recent gene flow may not be too important. Therefore, 
estimates of divergence time and effective population sizes 
will likely be biased downwards and upwards, respectively. 
However, the main result of pre-LGM divergence would 
not change even if gene flow were taken into account, as 
ignoring gene flow leads to underestimating the divergence 
time. 

Additional insights on the demographic history of 
D. malabaricum can be gained by comparing nSSRs and 
cpSSR. Since D. malabaricum is insect pollinated and seeds 
are dispersed via birds, one would expect more genetic 
structure in the nuclear genome than in the cp one (Petit 
et al. 2005), which was indeed found. The Coorg (8) pop- 
ulation primarily belonged to the southern cluster in the 
STRUCTURE analysis, while it was assigned to the north- 
ern group based on cpSSR variation. This suggests that 
organelle capture occurred between the northern and the 
southern lineages in the middle of the species range by 
repeated hybridization and backcross events among them 
through pollen flow from the southern lineage. However, 
it should be noted that we cannot rule out the possibility 
that the contrasting patterns could stem from a difference 
in mutation rates between the cp and nuclear genomes. 

Although there was some discrepancy in the geographic 
distributions of genetic groups between nSSRs and cpSSR, 
both genomes showed clear clusters, suggesting 
reproductive barriers in the species distribution range and 
historically limited gene flow among them. However, the 
Palghat and Shencottah gaps did not constitute such barri- 
ers. Regarding the seed flow, although the feeding behav- 
ior of hornbills tends to cause aggregation of individuals, 
they have been observed to fly over unsuitable habitats 
between fragments and sometimes even migrate several 
hundred kilometers (Raman and Mudappa 2003). This 
ability is apparently sufficient to maintain a high level 
of genetic connectivity between populations within the 
northern and southern cpSSR groups of D. malabaricum. 
Therefore, it may not come as a surprise that the 
40-kilometer-wide Palghat gap did not constitute a barrier 
to gene flow. On the other hand, while the longest pollen 
dispersal distance was 23.6 km, the mean pollen dispersal 
was 1205 m in high-density stands and 600 m in low- 
density stands, via beetles and thrips (Ismail et al. 2012). 
Thus, pollen dispersal in this species appears more 
restricted than seed dispersal. The results of the SGS analy- 
sis supported this and showed wider significant spatial 
autocorrelation in the cpSSR than in nSSRs. Moreover, in 
spite of multimodality in the STRUCTURE analysis, clear 
clustering was also found, and several single populations 
were assigned to specific clusters for values of K as high 
as 8. This genetic differentiation at a local scale also 
suggests limited pollen flow among populations. 

Implications for conservation 

The geographic pattern of genetic diversity found in 
D. malabaricum appears to reflect the species' natural 
population history rather than recent human impact. 
However, although a clear loss of genetic diversity through 
human activities was not suggested by this study, ongoing 
activities (e.g., harvesting and deforestation) and their 
secondary impact (niche competition in the ecosystem), as 
reflected in the imbalance between age classes, are still 
serious concerns for the long-term survival of the species. 
In particular, evidence of inbreeding in the northern 
populations, which are also among the most disturbed ones, 
and of a recent bottleneck in the Yakambi population, may 
be the first signals of unsuccessful regeneration due to 
human activities. Our study hence suggests that 
conservation priority should be given to these northern 
populations, and we recommend starting efforts to evaluate 
regeneration dynamics in this region. This would entail 
fine-scale analysis of genetic structure among age classes. 
Our data also suggest the presence of long distance seed 
dispersal at least within the two groups detected with 
cpSSR. Given the importance of long-distance dispersal in 
D. malabaricum, it is desirable to maintain populations of 
its seed disperser, the Malabar grey hornbill. Luckily, this 
species has recently been listed by the IUCN as "Least 
concern" (BirdLife International 2012). However, as 
discussed above, the distributions of haplotypes were 
largely different between the northern and southern groups. 
Therefore, more populations would need to be sampled 
between the two groups to evaluate the distribution of the 
genetic barriers of D. malabaricum. 

Wide-scale genetic structure studies have provided use- 
ful information on conservation units and seed zones and 
guidelines for restoration or plantation efforts in forest 
conservation and tree breeding programs of many tree 
species, especially economically important ones (Bucci 
and Vendramin 2000; Lefevre 2004; Tsuda and Ide 2005; 
Sutherland et al. 2010). The information provided by 
wide-scale genetic structure could also be relevant to the 
conservation of D. malabaricum. Based on the present 
study, we suggest that D. malabaricum should be man- 
aged as four separate units, according to the clusters 
found with nSSRs. Although the population structure of 
cpSSR was not congruent with this, priority should be 
given to the results of the nSSRs for two reasons: the 
inference is based on a larger number of independent loci, 
and it revealed more fine-scale diversity, which might be 
lost if the species were to be managed as only two units. 
Therefore, the minimum ambition should be to conserve 
one population each from the four clusters as a represen- 
tative for their respective genetic diversity. The status of 
the Agumbe population as a separate unit is not fully set- 
tled but should it be an admixed population, it still har- 
bors genetic diversity from all the three other known 
lineages and thus specific conservation efforts would be 
justified. Although large-scale transplantation or 
restoration of individuals may not currently be needed to 
mitigate inbreeding over most of the species' range, the 
proposed conservation units would be informative for 
adaptive management of this natural resource. Moreover, 
it is necessary to evaluate the ecological dynamics of not 
only the target species but also the whole ecosystem 
associated with it, and to design a more practical approach 
to ecosystem management in this biodiversity hotspot. 

Data Accessibility 

DIYABC input file and genotype data have been uploaded 
to DRYAD doi: 10.5061/dryad.g2bl0. 

Prior distributions of parameters used in DIYABC and the 
distribution of cp haplotypes over the twelve populations 
have been uploaded online as supporting information. 

Supporting Information 

Additional Supporting Information may be found in the 
online version of this article: 

Figure S1. The relationships between the matrix of pairwise 
differentiation, described as F ST/(1 − F ST) for nuclear 
SSRs (A) and R ST/(1 − R ST) for chloroplast SSR (B), and 
the matrix of the natural logarithm of geographic distance 
(in km) in the 12 populations. 

Figure S2. The population-based neighbor-joining tree in 
nuclear SSRs (A, D A distance, Nei et al. 1983) and 
chloroplast SSR (B, (δμ)² distance, Goldstein et al. 1995). 
Bootstrap values are shown where they exceed 50% in (A). 

Figure S3. The prior and posterior distributions for each 
parameter obtained by DIYABC. 

Figure S4. The principal component analysis (PCA) 
obtained by DIYABC. 

Table S1. Prior distributions of the parameters used in 
DIYABC. 

Table S2. F ST values per locus with and without correc- 
tion for null alleles. 

Table S3. Distribution of chloroplast haplotypes (alleles 
in ccmp7) in each of the 12 populations of Dysoxylum 
malabaricum. 

Table S4. Demographic parameters obtained by DIYABC. 